Language identification on code-switching utterances using multiple cues

Authors

  • Dau-Cheng Lyu
  • Ren-Yuan Lyu
Abstract

Code-switching speech is an utterance that contains two or more languages; the switch usually occurs at the clause or word level. In this paper, we propose a two-stage framework, a language identifier followed by a speech recognizer, and evaluate it on Mandarin-Taiwanese code-switching utterances. The language identifier uses multiple cues: acoustic, prosodic and phonetic features. To integrate these cues and distinguish one language from the other, we use a maximum a posteriori decision rule that combines an acoustic model, a duration model and a language model. In the experiments, we achieved error rate reductions of 34.5% for language identification (LID) and 17.7% for automatic speech recognition (ASR) compared with a one-stage LVCSR-based system.
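The abstract does not give the exact form of the decision rule, so the following is only a minimal sketch of how a maximum a posteriori choice over three models might look. It assumes each model returns a log-probability for a speech segment under each language and that the scores are simply summed with an optional log prior. The function name, the dictionary keys and all numbers are hypothetical and are not taken from the paper.

```python
import math


def map_language(scores, priors=None):
    """Return the language with the highest combined log-score.

    scores: dict mapping language -> dict with the hypothetical keys
            'acoustic', 'duration' and 'lm' (log-probabilities).
    priors: optional dict mapping language -> prior probability.
            If omitted, a uniform prior is assumed.
    """
    best_lang, best_score = None, -math.inf
    for lang, s in scores.items():
        log_prior = math.log(priors[lang]) if priors else 0.0
        # Assumption: the three cues are combined by adding log-scores.
        total = s["acoustic"] + s["duration"] + s["lm"] + log_prior
        if total > best_score:
            best_lang, best_score = lang, total
    return best_lang


# Made-up log-scores for one speech segment, for illustration only.
segment_scores = {
    "Mandarin": {"acoustic": -120.0, "duration": -8.0, "lm": -15.0},
    "Taiwanese": {"acoustic": -118.5, "duration": -9.5, "lm": -14.0},
}

print(map_language(segment_scores))
```

In a two-stage setup of the kind the abstract describes, a label chosen this way for each segment could then be used to route the segment to a language-specific recognizer. A real system would likely weight the three scores rather than add them equally.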

Similar resources

Modeling code-Switching speech on under-resourced languages for language identification

This paper presents an integration of phonotactic information to perform language identification (LID) in a mixed-language speech. A single-pass front-end recognition system is employed to convert the spoken utterances into a statistical occurrence of phone sequences. To process such phone sequences, a hidden Markov model (HMM) is utilized to build robust acoustic models that can handle multipl...

Developing Language-tagged Corpora for Code-switching Tweets

Code-switching, where a speaker switches between languages mid-utterance, is frequently used by multilingual populations worldwide. Despite its prevalence, limited effort has been devoted to develop computational approaches or even basic linguistic resources to support research into the processing of such mixed-language data. We present a user-centric approach to collecting code-switched utteran...

The effect of Code switching on the Acquisition of Object Relative Clauses by Iranian EFL Learners

This study attempted to investigate the impact of teacher’s code-switching on the acquisition of a problematic grammatical structure, namely, object relative clauses, by intermediate EFL learners. Moreover, a secondary objective of the study was to determine the EFL learners’ attitudes and opinions regarding the effectiveness of teacher’s code-switching in their learning of a specific aspect of...

A Language Modeling Approach to Identifying Code-Switched Sentences and Words

Globalization and multilingualism contribute to code-switching – the phenomenon in which speakers produce utterances containing words or expressions from a second language. Processing code-switched sentences is a significant challenge for multilingual intelligent systems. This study proposes a language modeling approach to the problem of code-switching language processing, dividing the problem i...

Motivational Determinants of Code-Switching in Iranian EFL Classrooms

“Code-Switching”, an important issue in the field of both language classroom and sociolinguistics, has been under consideration in investigations related to bilingual and multilingual societies. First proposed by Haugen (1956) and later developed by Grosjean (1982), the term code-switching refers to language alternation during communication. Although code-switching is unavoidable in bilingual and...

Publication date: 2008